ABSTRACT
Professional Issues in the Evaluation of Education/Training Programs



Program evaluation has been something of an outcast in social science 
research, despite attempts by Cronbach, Scriven, Stake, Suchman, and 
others to make it respectable. The relative lack of generalizability,
apparent in most evaluations but less apparent (although often as real)
in other social science research, may be one reason why program evaluation
is held in low esteem. But it is not the whole story. For one thing,
program evaluation has been given insufficient support and emphasis; 
for another, it has frequently been left to inadequately trained personnel 
to carry out. Many of those with the ability and scientific background 
to conduct competent evaluations have lacked either the interest or 
experience to make the contributions to the field that they were capable 
of. 

In our view, the differences between research and evaluation are more 
differences of purpose and style than of great substance. To polarize 
the situation, some argue that research holds promise for future program 
development; evaluation assumes current or past program development. 



* This report was stimulated by a meeting held at Educational Testing
Service under the auspices of the Office of Naval Research (Contract No.
N 0014-72-O0433, NR 154-359). Participants included Lee J. Cronbach,
Joel Davitz, Henry S. Dyer, Henry M. Levin, Robert Perloff, Seymour Sarason,
Michael Scriven, Robert E. Stake, Julian C. Stanley, Melvin M. Tumin;
Marshall J. Farr, Joseph L. Young, ONR; Ernest J. Anastasio, Albert E. Beaton,
Paul B. Campbell, Garlie Forehand, Norman Frederiksen, J. Richard Harsh,
Dean Jamison, Frederic M. Lord, Albert P. Maslow, Samuel J. Messick,
Richard T. Murphy, Charles E. Scholl, William W. Turnbull, ETS. However,
the opinions expressed in the report are the authors' and do not necessarily
reflect those of the participants or the Office of Naval Research.
Research offers to extend our knowledge about abstract principles;
evaluation offers to extend our knowledge about specific practices.
Research provides generalizable knowledge without necessarily providing
immediate payoff; evaluation provides immediate payoff without necessarily
providing generalizable knowledge. Research is knowledge oriented;
evaluation is decision oriented. However, such distinctions are frequently 
arbitrary and seldom sharp. A well-conceived evaluation study can yield 
information useful for improving a specific program and also contribute 
to our general knowledge about how people learn. 

The purpose of this paper is to present a codification of some
evaluation principles and a framework for appropriate evaluation practices. 
We hope this effort will enable both experienced and neophyte evaluators 
to understand their profession more comprehensively and practice it more 
systematically. The greater systemization suggested here may also help 
to combat some prejudices against program evaluation as a worthy activity 
for social scientists. Specifically, we shall try to: 

• Delineate the most common purposes of evaluation efforts and
  indicate the general methods of investigation that are most apt
  for each purpose (Table 1).

• Highlight some of the types and sources of evidence frequently
  associated with the general methods of investigation (Table 2).

• Classify the types of administrative and fiscal relationships
  that may exist among the evaluator, the funding source, and the
  program developer/director (Table 3).

• Provide a checklist of potential audiences for evaluation results,
  indicating the most appropriate communication forms for each
  audience group (Table 4).

• Suggest a means for defining and communicating some typical
  ideologies and value orientations of evaluators (Table 5).

• Provide a comprehensive list of competencies which evaluators need
  to varying degrees, as an aid to evaluating and training
  evaluators (Table 6).

• Outline the ethical responsibilities of evaluators and related
  groups (Table 7).

The tables referred to above are the heart of this presentation and 
should be studied in some detail. 

1. Evaluation Purposes and Methods 
Evaluations are undertaken for a great many reasons or purposes. 
These mandate areas of involvement for the evaluator attempting to provide 
relevant information for decision making. We have distinguished six 
major purposes or areas of involvement, and each of these six has been
broken down into a number of components, as shown in Table 1. Table 1
also includes a matching of evaluation purposes-components to likely
general methods of investigation. Let us examine each of the six major
evaluation purposes in turn.

I. To contribute to decisions about program installation. Historically
the evaluation process has been thought of as beginning after the decision 
to implement an education/ training program. However, a number of the 



* The content of this list of evaluation purposes benefitted from 
Scriven's (1974) "Product Checklist." It should be noted, however, that
Scriven's list is designed to be used primarily for appraising completed 
educational products or evaluation proposals, while Table 1 is intended 
as an aid to overall evaluation planning. 
Table 1

Purpose and General Methods
of Program Evaluation

Likely investigation method

I. To contribute to decisions about program installation
   A. Need
      1. Frequency
         a. Student
         b. Society
         c. Other (e.g., industrial, professional, governmental)
      2. Intensity
         a. Student
         b. Society
         c. Other
   B. Program conception
      1. Appropriateness
      2. Quality
      3. Priority in the face of competing needs
   C. Estimated cost
      1. Absolute cost
      2. Cost in relation to alternative strategies
         oriented toward same need
   D. Operational feasibility
      1. Staff
      2. Materials
      3. Facilities
      4. Schedule
   E. Projection of demand and support
      1. Popular
      2. Political/financial
      3. Professional

II. To contribute to decisions about program continuation,
    expansion, and/or "accreditation"
   A. Continuing need
      1. Frequency
         a. Student
         b. Society
         c. Other
      2. Intensity
         a. Student
         b. Society
         c. Other
   B. Global effectiveness in meeting need
      1. Short-term
      2. Long-term
   C. Minimal negative side-effects
   D. Important positive side-effects
   E. Cost
      1. Absolute cost
      2. Cost in relation to alternative strategies
         oriented toward same need
      3. Cost in relation to benefits
   F. Demand and support
      1. Popular
      2. Political/financial
      3. Professional

III. To contribute to decisions about program modification
   A. Program objectives
      1. Validity and utility (in meeting needs)
      2. Popular acceptance
      3. Professional acceptance
      4. Student acceptance
      5. Instructor acceptance
   B. Curriculum content
      1. Relevance to program objectives
      2. Coverage of objectives
      3. Technical accuracy
      4. Degree of structure
      5. Relevance to backgrounds of students
      6. Effectiveness of components
      7. Sequencing of components
      8. Difficulty
      9. Popular acceptance
      10. Professional acceptance
      11. Student acceptance
      12. Instructor acceptance
   C. Instructional methodology
      1. Degree of student autonomy
      2. Effectiveness of presentation methods
      3. Pacing and length
      4. Reinforcement system
      5. Student acceptance
      6. Instructor acceptance
   D. Program context
      1. Administrative structure, auspices
      2. Program administration procedures
      3. Staff roles and relationships
      4. Public relations efforts
      5. Physical facilities and plant
      6. Fiscal sources and stability
      7. Fiscal administration procedures
   E. Personnel policies and practices
      1. Students
         a. Recruitment
         b. Selection and placement
         c. Evaluation
         d. Discipline
         e. Retention
      2. Instructors
         a. Selection and placement
         b. In-service training
         c. Evaluation for promotion, guidance,
            retention, etc.
      3. Administrators
         a. Selection
         b. Evaluation for promotion, retention, etc.

IV. To obtain evidence favoring program to rally support
   A. Popular
   B. Political/financial
   C. Professional

V. To obtain evidence against program to rally opposition
   A. Popular
   B. Political/financial
   C. Professional

VI. To contribute to the understanding of basic processes
   A. Educational
   B. Psychological
   C. Social
   D. Economic
   E. Evaluation (Methodology)
skills and techniques usually associated with evaluation of existing
or planned programs are applicable to what Harless (1973) has called
"Front-end analysis." Assessment of the frequency and/or intensity
of needs for a program, evaluation of the initial conception, and 
estimates of costs, operational feasibility, and demand and support are 
all important precursors to decisions about whether to implement a 
program and about the size and scope of the installation. 

II. To contribute to decisions about program continuation,
expansion (or contraction), and/or "accreditation." This purpose is 
the one usually served by what is popularly called "summative evaluation"; 
however, more is included here than is sometimes intended by that term. 
For example, investigations under Purpose II may involve some of the 
same components as investigations under Purpose I; after a program is 
in operation, it is important to monitor the continuing needs for the 
program (some of them may change or even go away) and to assess actual 
costs and demand/support. Results of these investigations need to be 
considered along with results of impact studies (focusing on both intended 
and unintended outcomes) in making decisions about program continuation, 
expansion, or "accreditation."

III. To contribute to decisions about program modification. This
purpose corresponds to the one usually ascribed to formative evaluation,
although information about program components can also be obtained after 
a program is in full operation and in the context of a global appraisal 
of effectiveness. Of course, if a program is cast in an unchangeable mold, 
the evaluator is wasting his time seeking information to help make it
better. A major distinction between evaluation efforts devoted to 
Purpose III, as opposed to Purpose II, is in the emphasis on describing 
program processes in contrast to program products. As Table 1 indicates, 
the evaluator may seek information to guide program improvement in a 
broad range of areas, including program objectives (e.g., validity and 
utility in meeting needs, popular acceptance), curriculum content 
(e.g., relevance to objectives, technical accuracy), instructional 
methodology (e.g., degree of student autonomy, pacing), program context 
(e.g., administrative structure, staff roles), and personnel policies and 
practices (e.g., student recruitment, instructor selection). 

IV. and V. To obtain evidence favoring a program to rally support
or To obtain evidence against a program to rally opposition. These two
purposes are presented in recognition of the realities of program 
evaluation. Many evaluators shun evaluations with these purposes; many 
people who "commission" evaluations are unwilling to admit to their real 
motives. But there are indeed occasions when decision makers must rally 
support for a program in order to sustain it, or opposition to it in order 
to "kill it" so that funds can be diverted to other things. And there may 
be occasions when decision makers are willing to entertain both negative 
and positive evidence about the effectiveness of a program. The adversary 
model of evaluation integrates this purpose with the full thrust of the 
evaluation effort (Churchman, 1961; Stake & Gjerde, 1971; Levine, 1973).
In any case, it is better if the evaluator's client faces up to the real 
reasons for the evaluation and does not keep them hidden from the 
evaluator. The evaluator's responsibilities, in turn, include defining
clearly the nature of the evidence being presented, indicating its lack
of representativeness if that is indeed the case, and ensuring the
validity of the evidence even if it is only a partial picture of the 
total state of affairs. 

VI. To contribute to the understanding of basic processes.
Pursuing the purposes of a decision-oriented evaluation does not preclude 
investigating, within the context of the same study, basic processes in 
at least one of the disciplines listed under Purpose VI, Table 1. 
However, evaluators cannot afford to lose sight of the fact that the 
program must be the central focus. A search for understanding of basic 
processes can be a means to sharpen the focus of the investigation. 

We have pointed out that Table 1 includes a conjunction of evaluation 
purposes and general methods of investigation. Eight general investigatory
methods are listed: experimental studies, quasi-experimental studies 
(including studies where correlations/predictions serve as dependent 
variables, as well as the more usual means), correlational status studies 
(where no available manipulation occurs at all, and the data to be 
correlated are generally collected concurrently), surveys (e.g., of 
attitudes toward the program, records of program operations), personnel
or student assessments (using tests and other measurement devices with 
the staff or students involved), systematic "expert" judgments (e.g., 
ratings), clinical or case studies (focusing on particular students, 
subgroups of students, program components, etc.), and informal 
observation and/or testimony. We should remind ourselves that the last 
method was the most prevalent form of program evaluation until
very recently. How many of the textbooks that we used were adopted on
the basis of anything other than testimony?



In Table 1, we have indicated, for example, that the most likely 
methods to be used to assess the frequency of student needs (see I. A) 
are surveys, student assessments, and systematic "expert" judgments.
If we were investigating the intensity of student needs, we might very 
well also accept data from case studies and testimony. (The urgency 
for a remedial reading program for a large number of 8th graders 
reading at 6th-grade level would be very different from the urgency 
for a program to help the handful reading at 2nd-grade level.) It will 
be noted in Table 1 that every time an evaluation purpose calls for an
estimate of program or component effectiveness (e.g., II. B, III.C.2), 
an experimental or quasi-experimental study is suggested as the most 
likely (and appropriate) method of investigation. The relationship 
of the general methods to Purposes IV and V has a slightly different 
meaning from relationships to the other purposes. Here we must ask: 
What kind of evidence is most likely to rally support for (or opposition 
to) the program? We suggest that a professional audience would be less 
swayed by survey or assessment data than a lay audience would be, but
that the public would join professionals in respecting relatively "hard"
evidence. 

No claim is made that our designations of likely methods of inves- 
tigation for particular evaluation purposes and objects are comprehensive 
or definitive. However, the evaluator and the administrator calling 
for an evaluation might well use Table 1 in the planning effort, for 
it at least provides a systematic way of considering the variety of 
purposes an evaluation might serve and the variables on which it might 
focus, as well as the general investigatory methods that might be employed 
to obtain information relevant to these purposes and variables. 
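
As a planning aid, the matching of purposes to methods described above can be kept as a small lookup structure. The sketch below is illustrative only: it encodes just the few pairings spelled out in the text (frequency and intensity of student needs under I.A, plus II.B and III.C.2), and the names `Method`, `LIKELY_METHODS`, and `likely_methods` are hypothetical, not part of Table 1.

```python
from enum import Enum


class Method(Enum):
    """The eight general methods of investigation named in the text."""
    EXPERIMENTAL = "experimental study"
    QUASI_EXPERIMENTAL = "quasi-experimental study"
    CORRELATIONAL = "correlational status study"
    SURVEY = "survey"
    ASSESSMENT = "personnel or student assessment"
    EXPERT_JUDGMENT = 'systematic "expert" judgment'
    CASE_STUDY = "clinical or case study"
    TESTIMONY = "informal observation and/or testimony"


# Only pairings stated in the text are encoded. The keys follow the
# outline codes of Table 1 (an assumed labeling convention).
LIKELY_METHODS = {
    # Frequency of student needs
    "I.A.1.a": {Method.SURVEY, Method.ASSESSMENT, Method.EXPERT_JUDGMENT},
    # Intensity of student needs: case studies and testimony also accepted
    "I.A.2.a": {Method.SURVEY, Method.ASSESSMENT, Method.EXPERT_JUDGMENT,
                Method.CASE_STUDY, Method.TESTIMONY},
    # Estimates of program or component effectiveness
    "II.B": {Method.EXPERIMENTAL, Method.QUASI_EXPERIMENTAL},
    "III.C.2": {Method.EXPERIMENTAL, Method.QUASI_EXPERIMENTAL},
}


def likely_methods(component):
    """Return the likely methods for a Table 1 component, or an empty
    set when the component is not among the pairings encoded above."""
    return set(LIKELY_METHODS.get(component, set()))


if __name__ == "__main__":
    for method in sorted(likely_methods("I.A.2.a"), key=lambda m: m.value):
        print(method.value)
```

An empty result means only that the pairing was not transcribed into this sketch, not that no method applies.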
2. Types and Sources of Evidence Frequently Associated with General Methods of Investigation

Table 2 provides a second cross-tabulation. On one axis are the eight 
general methods of investigation initially presented in Table 1. Each of 
these eight methods has been augmented by examples of the types of 
evidence frequently offered by investigators using the general method. 
Thus the Survey method (IV) has as examples: A. Projections of manpower 
needs; B. Summaries of attitudes/opinions about the ongoing program 
expressed by students, instructors, others; and C. Descriptions of 
program characteristics, operations, costs. 

On the other axis are ten sources of evidence. These sources are not 
necessarily independent. For example, "Expert opinion" might be obtained
via "Questionnaire or interview," and "Social indicators" might be
obtained through "Records." We have deliberately allowed some confounding 
here of kind of evidence and technique used for gathering it, in order to 
use terms which we hoped would best communicate the essence of the sources 
of evidence to evaluators and program directors. 

Within the cross-tabulation we have associated relevant sources of 
evidence with types of evidence typically presented under the different 
general methods of investigation. The associations provide our informal 
definition of "appropriate" sources of data for various types of evidence. 
However, they are meant to be suggestive, rather than prescriptive. From 
Table 2 it can be seen, for example, that we suggest that an investigator 
presenting correlations among student measures (III.D) might include 
the following in his matrix: test scores, data derived from
questionnaires or interviews, grades (ratings), and/or results from
clinical examinations. Or, an investigator conducting a case study
Table 2

Examples and Types of Sources
of Evidence Frequently Associated
with the Various General
Methods of Investigation*

Likely source of evidence

I. Experimental study
   A. Differences between performance of students in the program and
      performance of other students
   B. Differences between performance of students exposed to program
      variations
   C. Data on differential program effects for students with different
      characteristics

II. Quasi-experimental study
   A. Changes in student performance over the time of exposure to the
      program
   B. Changes in student performance for different program components,
      variations
   C. Differential predictions of "success" for students exposed and not
      exposed to the program

III. Correlational status study
   A. Correlations between program characteristics (sometimes including
      costs) and student performance
   B. Correlations between student characteristics (such as race, sex)
      and student performance
   C. Correlations among program characteristics
   D. Correlations among student measures

IV. Survey
   A. Projections of manpower needs
   B. Summaries of attitudes/opinions about the ongoing program
      expressed by students, instructors, others
   C. Descriptions of program characteristics, operations, costs

V. Personnel or student assessment
   A. Profiles of characteristics of entering, leaving, past, or
      prospective students
   B. Summary descriptions of characteristics of program personnel
VI. Systematic "expert" judgment
   A. Recommendations by a commission appointed to delineate a problem
      and recommend possible solutions
   B. Report of curriculum/materials review or evaluation panel
   C. Report of site visit to the program by a team of outside experts

VII. Clinical or case study
   A. Analysis of program processes (implementation, management,
      evolution, etc.)
   B. Phenomenological analysis of institutional change
   C. Summary of impressions gained from examination of special student
      or personnel groups (e.g., referrals)

VIII. Informal observation and/or testimony
   A. Anecdotes about experiences of particular students, instructors, etc.
a See also "Table 1, Data Sources for Evaluation Efforts," in Anderson et al. (1975), Encyclopedia of Educational
Evaluation, p. 116. Reference is also made (below) to Encyclopedia pages for more complete definitions of
many of the sources of evidence.

b Tests include paper-and-pencil, situational, and performance tasks. See pp. 425-428.

c See pp. 214-217, 311-314.

d Kept by participants during the course of the program.

e See pp. 266-270.

f Including grades, supervisors' ratings, expert opinions in the form of ratings. Questionnaires and ratings are
not mutually exclusive; questionnaires might include ratings as well as other types of information. See
pp. 315-318.

g Including physiological, psychological, and psychiatric appraisals.

h Including personnel records, publications, financial data, program materials.

i Census data, crime rates, etc. See pp. 374-377.
oriented toward a phenomenological analysis of institutional change
(VII. B) might utilize information derived from logs and diaries, 
observations, and expert opinion. 

Table 2 may be useful to evaluators or persons commissioning or
monitoring evaluations to give more definition to the general methods 
of investigation listed in Table 1, to remind them of the variety of 
sources of evidence that might be used in a particular study, and to 
focus attention on possible dissonances between types of evidence and 
the sources of information on which they are based. 
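
One way to act on the last point, checking for dissonance between a type of evidence and the sources behind it, is sketched below. It is a minimal illustration under stated assumptions: only the two examples worked through in the text (III.D and VII.B) are encoded, the source labels are paraphrased, and `SUGGESTED_SOURCES` and `dissonant_sources` are hypothetical names.

```python
# Suggested sources for two types of evidence, taken from the examples
# given in the text. Table 2 covers many more types than are encoded here.
SUGGESTED_SOURCES = {
    # III.D: correlations among student measures
    "III.D": {"tests", "questionnaires or interviews", "ratings",
              "clinical examinations"},
    # VII.B: phenomenological analysis of institutional change
    "VII.B": {"logs and diaries", "observations", "expert opinion"},
}


def dissonant_sources(evidence_type, sources_used):
    """Return, in sorted order, the sources used that are not among those
    suggested for the given type of evidence.

    Raises KeyError if the evidence type is not encoded above.
    """
    suggested = SUGGESTED_SOURCES[evidence_type]
    return sorted(set(sources_used) - suggested)


if __name__ == "__main__":
    print(dissonant_sources("VII.B", ["observations", "tests"]))
```

Because the associations are meant to be suggestive rather than prescriptive, a flagged source is a prompt for a second look, not an error.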

3. Administrative and Fiscal Dependence-Independence of the Evaluator

To this point we have presented a framework for selecting a set 
of evaluation purposes and general methods of investigation, as well as 
examples of types and sources of evidence. The processes of selecting 
goals, methods, etc., occur within a political-economic context that is 
frequently ignored in the evaluation literature but which, nevertheless, 
can exert a profound influence on the evaluation.

The principal actors in the scene are the funding agent(s), the
program director/developer, and the evaluator. Of course, choruses can
substitute for one or more of the actors (e.g., a funding consortium,
a program development committee). Of most concern here is the position
of the evaluator, who can be dependent upon, related to, or independent
of the other actors.

"Dependency" has two aspects: administrative and financial. The 
evaluator is administratively dependent upon the program director when 
he is required to report to the program director in some institutionalized 
way. He is financially dependent on the program director when the
program director controls the funds available for the evaluation. The 
evaluator is administratively independent when he reports to an external 
authority and financially independent when funds for the evaluation are 
allocated directly to him by an agency that has no other connection 
with the program. 

"Relatedness" occurs either when the evaluator and program director
report to the same administrative authority (e.g., a board of education, 
company vice president, or economic development council) or when funds 
for program operation/development and for the evaluation stem from the 
same agency (e.g., a foundation or government source). 

These relationships are graphically represented in Table 3 and 
are determined by the answers to two simple questions: 

1. Who does the evaluator report to?

   The program director (administratively dependent)
   The same authority as the program director (administratively related)
   An independent authority (administratively independent)

2. Where do the funds for the evaluation come from?

   The program director (financially dependent)
   The same funding source as the program (financially related)
   An independent funding source (financially independent)
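
The two questions amount to a simple classification rule, which can be sketched as follows. This is an illustration only: the answer strings used as dictionary keys and the names `Relationship`, `classify_evaluator`, and `describe` are hypothetical and do not come from Table 3.

```python
from enum import Enum


class Relationship(Enum):
    DEPENDENT = "dependent"
    RELATED = "related"
    INDEPENDENT = "independent"


# Question 1: who does the evaluator report to? (assumed answer labels)
REPORTS_TO = {
    "program director": Relationship.DEPENDENT,
    "same authority as program director": Relationship.RELATED,
    "independent authority": Relationship.INDEPENDENT,
}

# Question 2: where do the evaluation funds come from? (assumed labels)
FUNDED_BY = {
    "program director": Relationship.DEPENDENT,
    "same funding source as program": Relationship.RELATED,
    "independent funding source": Relationship.INDEPENDENT,
}


def classify_evaluator(reports_to, funded_by):
    """Return the (administrative, financial) relationship pair.

    Raises KeyError if either answer is not one of the labels above.
    """
    return REPORTS_TO[reports_to], FUNDED_BY[funded_by]


def describe(reports_to, funded_by):
    """Render the classification as a short phrase."""
    admin, fiscal = classify_evaluator(reports_to, funded_by)
    return f"administratively {admin.value}, financially {fiscal.value}"


if __name__ == "__main__":
    print(describe("program director", "same funding source as program"))
```

Three possible answers to each question give nine combinations in all, one for each pairing of an administrative and a financial category.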
On the surface, it might seem that the more independent the evaluator,
the better the evaluation. Further consideration does not necessarily 
provide support for that generalization. There are advantages and dis- 
advantages in the different categories of relationship, complicated in 
Table 3

Dependence-Independence of the Evaluator

                                Financially    Financially    Financially
                                Dependent      Related        Independent

Administratively Dependent

Administratively Related

Administratively Independent
some instances by an interaction between the purpose of the evaluation
and the kind of relationship that is desirable. Dependent relationships
may promote responsivity by the evaluator to particular program needs.
This can be worthwhile when the purpose of the evaluation is to
improve the program (formative evaluation). However, dependence can
be counterproductive when the purpose of the evaluation is to provide
a credible, global assessment of the program's impact (Scriven, 1967).
Skeptics will certainly question evaluation results produced by a
"captive" evaluator.

There are instances when it would seem very desirable for the agency 
that funds the program also to fund the evaluation. Indeed, this has 
frequently been the case with large federally funded intervention programs 
or major curriculum projects funded by foundations. Again, the 
advantage is responsivity by the evaluator, this time to the expectations 
of the funding agency. However, even the judgments of such agencies 
can become warped. Having committed themselves heavily to a new program, 
they may become increasingly reluctant to hear anything negative about 
it. They may even reach the point that they tend to fault the evaluator 
rather than the program, a reaction akin to the ancient custom of 
beheading the bearer of bad news. 

Just as there are problems in dependence and problems in relatedness, 
there are also problems in independence. This is vouched for empirically 
by Bernstein and Freeman (1975), who found that the quality of evaluation 
studies (as measured by expert judgments) decreased as the independence 
of the evaluation effort increased. Independence can also be related to 
potential impact of evaluation results. At the extreme, evaluations 
might be so independent that results would have no bearing on the
decision needs of program directors or, worse, produce valuable 
information that never reached program directors. 

It is impossible to specify the kind of administrative-financial 
relationships among the evaluator, program director, and funding agent 
that would be universally satisfactory. What can be specified, however, 
is the nature of the relationships among the parties at the outset 
of any particular evaluation; and many potential problems can be 
dissipated by an understanding of and continuing commitment to the 
stipulated relationships. In case of serious violations or disagreements, 
the possibility of some external body to whom the evaluator might turn 
is exciting. Professional organizations concerned with program 
evaluation might well consider whether such a tribunal is practical 
at this time. 

4. Dissemination of Evaluation Results 
Let us assume that the purposes of the evaluation were justifiable, 
that the methods of investigation and resulting evidence were responsive 
to the purposes, and that the evaluation processes were carried out in 
a supportive milieu (rather than one that was politically contentious 
and/or economically impoverished). The evaluator has almost finished 
the task, but not quite. It is time now to disseminate the results of 
the evaluation. The first premise is that if the evaluation was worth 
doing, there are groups who have some interests — perhaps strong ones — in 
the findings. Responsible — or responsive — evaluations include analyses 
of these audiences and inquiries into the kinds of evidence they would 
honor early on in the process (Stake, 1975, p. 29). 
Table 4

Dissemination of Evaluation Results

Likely communication form

Potential Audience

Funding agencies for program or evaluation
Program administrators
Other relevant management-level staff
Board members, trustees
Technical advisory committees
Relevant political bodies (e.g., legislatures)
Interested community groups
Current students (guardians where appropriate)
Prospective students
Instructors
Professional colleagues of evaluator(s)
Organizations or professions concerned with program content
Local, state, regional media
National media
Other
Unfortunately the typical dissemination procedure seems to be to
provide some thirty copies of a technical evaluation report (bound in a
nondescript cover) to gather dust on the shelves of the funding agency.
This is ecologically wasteful if nothing else. Of course written technical 
reports have their place, but evaluation dissemination is too important a 
part of the evaluation process — as feedback is to the learning process — to 
be treated thoughtlessly. 

In Table 4 we present nine ways of communicating (disseminating) 
evaluation results. The choice of the form of communication has to be
made in terms of the likely audience for that communication. We have 
listed fourteen potential audiences as a cross-tabulation for the nine 
communication forms, and we also suggest the most appropriate forms for 
each possible audience. For example, the funding agency should certainly 
be given the technical report and the executive summary (a short, 
intelligible presentation of the principal findings, with a minimum of 
jargon). Relevant political groups should receive the executive summary 
and any popular articles based on the evaluation. Local, state, or regional 
media will usually not be interested in technical reports but may be 
interested in receiving news releases, attending press conferences, or 
covering public meetings. 

There are two reasons for including Table 4 in this report: to suggest 
what communication forms are most appropriate for specific audiences and, 
more important, to emphasize the need for evaluators to make a conscious 
listing of potential audiences for their results and to broaden their
consideration of useful forms of communication.
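
A conscious listing of audiences can be as simple as a written mapping from each audience to the forms it should receive. The sketch below is a minimal illustration: it encodes only the three examples given in the text, and the names `RECOMMENDED_FORMS` and `audiences_by_form` are hypothetical.

```python
# Audience -> recommended communication forms, limited to the three
# examples discussed in the text. Table 4 lists more audiences and forms.
RECOMMENDED_FORMS = {
    "Funding agencies": ["technical report", "executive summary"],
    "Relevant political bodies": ["executive summary", "popular articles"],
    "Local, state, regional media": ["news releases", "press conferences",
                                     "public meetings"],
}


def audiences_by_form(recommended):
    """Invert the audience -> forms mapping into form -> audiences, so the
    evaluator can see who should receive each product."""
    by_form = {}
    for audience, forms in recommended.items():
        for form in forms:
            by_form.setdefault(form, []).append(audience)
    return by_form


if __name__ == "__main__":
    for form, audiences in audiences_by_form(RECOMMENDED_FORMS).items():
        print(f"{form}: {', '.join(audiences)}")
```

Inverting the mapping shows at a glance which products must be prepared and for whom; for instance, the executive summary serves both the funding agency and relevant political bodies.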
5. Values in Evaluation

There is considerable argument about the role that the investigator's 
values can and should play in scientific enterprises, with many people 
maintaining that a neutral stance is essential for any scientific endeavor.
Such arguments frequently fail to distinguish between professional and 
personal values. For example, an investigation of physical phenomena may 
be carried out without any overriding concern for what the physicist 
considers useful to the community — a personal value. However, it is 
virtually impossible to dissociate the investigator's professional-scientific
values from either the phenomena he or she chooses to study or
the methods employed in the investigation. 

The same is certainly true of the evaluator of an education/ training 
program. We know that the professional values he holds, based in large 
part on the type of training he has had and the evaluation "model" he 
prefers, influence the choice of evaluation design, measurement techniques, 
methods of analysis, and ways in which the data are interpreted. Even 
more critical are the personal values the evaluator places on the program 
to be evaluated. If he is all "for" early education or prevention of dropout 
or teaching computer programming, we might suspect that his evaluations 
of programs with those contents would be different from those of a more 
skeptical evaluator. Furthermore, it is possible that the personal 
values he places on the program may interact with his professional values 
to influence design, measurement, analysis, and interpretation decisions. 

We are inclined to believe that there is no way to remove the evaluator's 
values from the evaluation process. After all, the word "evaluation" presents 
the centrality of values quite literally. Nor are we convinced that a 
value-free stance would necessarily be desirable if it were attainable. 
However, there does seem to be a need for evaluators to attempt to make 
their value orientations as specific as possible both before they undertake 
an evaluation and in the processes of carrying it out and reporting the 
results. Unfortunately, evaluators have had few pressures from their 
clients or potential clients to make their values explicit and little 
commitment to analyzing those values for their own self-understanding. 
They have also lacked a convenient means for doing so. 

Table 5 presents a preliminary scheme by which evaluators might 
examine their professional predispositions and preferences. (It does not 
pretend to deal with the issue of personal attitudes toward the objectives, 
content, or operation of specific programs the evaluator might be called 
upon to appraise.) An attempt has been made to describe seven dimensions 
that seem to be central to the evaluator's professional values and that 
are not necessarily highly correlated. The descriptions take the form of 
labels (e.g., Absolutist-Comparative) and examples of the kinds of design, 
measurement, analysis, and/or interpretation preferences that might be 
associated with the extremes of the dimensions (e.g., within-group analysis 
vs. between-group analysis). The examples might also be thought of as 
"symptoms" — if an evaluator tends to prefer clinical or case studies to 
experimental or quasi-experimental designs, he is more likely to be 
Phenomenological than Behavioristic. (It should go without saying that 
there is no intent here to attach value judgments to the dimensions 
themselves; Phenomenological is not "good" and Behavioristic "bad" or 
vice versa.) 

Consider an example: One evaluator might characterize herself as 
leaning more toward Behavioristic (than Phenomenological), Comparative 

Table 5 

Predispositions and Preferences of Evaluators 
(Including Examples of Design, Measurement, Analysis, and Interpretation 
Preferences Associated with the Principal Dimensions) 

                PHENOMENOLOGICAL                    BEHAVIORISTIC
Design          Clinical or case study              Experimental or quasi-experimental
                                                    design
Measurement     Subjective measurement methods,     Objective measurement methods,
                content analyses, self-reports      tests, systematic observations
Analysis        Descriptive statistics and          Inferential statistics
                nonparametric techniques
Interpretation  Judgmental, value-laden             Nonjudgmental

                ABSOLUTIST                          COMPARATIVE
Design          One-group design                    Experimental or quasi-experimental
                                                    design with comparison group(s)
Analysis        Within-group analysis               Between-group analysis
Interpretation  Standard-referenced                 Comparison-group-referenced

                INDEPENDENT                         DEPENDENT
Measurement     Goal-free measures                  Measures tailored to program goals
Interpretation  Nonclient-oriented                  Goal-referenced, client-oriented

                PRAGMATIC                           THEORETICAL
Design          Widely varying                      Experimental or quasi-experimental
                                                    design (hypothesis testing)
Measurement     Ad hoc measures, records            Established measures, construct
                                                    validity emphasized
Analysis        Widely varying                      Inferential statistics
Interpretation  Program-specific conclusions,       Hypothesis confirmation,
                little generalization               generalization (nomothetic)
                (ideographic)

                NARROW SCOPE                        BROAD SCOPE
Measurement     Few and specific measures           Many and global measures
Analysis        Univariate contrasts                Multivariate analyses
Interpretation  Oriented toward component           Oriented toward system
                functioning                         functioning

                HIGH INTENSIVE                      LOW INTENSIVE
Design          Repeated measurement occasions      Infrequent measurement occasions
                (longitudinal)                      (perhaps cross-sectional)
Measurement     Multi-trait, multi-method           Survey tests
                (triangulation)
Analysis        Multivariate analyses, including    Univariate analyses, descriptive
                factor analyses                     statistics
Interpretation  Generalization                      Description

                PROCESS                             PRODUCT
Design          Repeated measurement occasions      Experimental or quasi-experimental
                                                    design, infrequent measurement
                                                    occasions
Measurement     Observations, logs, interviews      Tests
Analysis        Descriptive statistics              Inferential statistics
Interpretation  Recommendations for program         Recommendations for program
                improvement                         continuation, expansion,
                                                    "accreditation"

(than Absolutist), Independent (than Dependent), Pragmatic (than Theoretical), 
Broad Scope (than Narrow Scope), High Intensive (than Low Intensive), and 
Process (than Product). Another evaluator might characterize himself as 
different on two of these dimensions, describing himself as more Dependent 
and Narrow Scope. Other things being equal, we would expect the second 
evaluator to develop an evaluation plan different from the first 
evaluator's, with measures tailored more specifically to the program or 
client's goals, fewer and less global measures, and relatively more 
emphasis on the functioning of program components. 
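
The comparison of the two evaluators can be written down mechanically. The following Python fragment is a minimal sketch of that bookkeeping, offered only as an illustration: the dictionary layout and the helper name differing_dimensions are assumptions introduced here, not part of Table 5, and the sketch records only which pole an evaluator leans toward, not how strongly.

```python
# Illustrative sketch only. The data layout and the helper name
# differing_dimensions are assumptions, not part of Table 5.

# The seven dimensions of Table 5, each given as its pair of pole labels.
DIMENSIONS = [
    ("Phenomenological", "Behavioristic"),
    ("Absolutist", "Comparative"),
    ("Independent", "Dependent"),
    ("Pragmatic", "Theoretical"),
    ("Narrow Scope", "Broad Scope"),
    ("High Intensive", "Low Intensive"),
    ("Process", "Product"),
]

# Leanings of the first evaluator in the example.
first_evaluator = {
    ("Phenomenological", "Behavioristic"): "Behavioristic",
    ("Absolutist", "Comparative"): "Comparative",
    ("Independent", "Dependent"): "Independent",
    ("Pragmatic", "Theoretical"): "Pragmatic",
    ("Narrow Scope", "Broad Scope"): "Broad Scope",
    ("High Intensive", "Low Intensive"): "High Intensive",
    ("Process", "Product"): "Process",
}

# The second evaluator differs on two dimensions only.
second_evaluator = dict(first_evaluator)
second_evaluator[("Independent", "Dependent")] = "Dependent"
second_evaluator[("Narrow Scope", "Broad Scope")] = "Narrow Scope"


def differing_dimensions(a, b):
    """Return the dimensions on which two evaluator profiles lean differently."""
    return [dim for dim in DIMENSIONS if a[dim] != b[dim]]


for dim in differing_dimensions(first_evaluator, second_evaluator):
    print(dim, first_evaluator[dim], "vs.", second_evaluator[dim])
```

A listing of this kind says nothing about how large the resulting differences in evaluation plans would be; it only makes explicit where differences in design, measurement, analysis, or interpretation might be expected.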

As matters currently stand, evaluation critics point out that the 
conclusions of two evaluations of the same program could easily bear little 
resemblance to one another simply because they were conducted by different 
evaluators (see Shapiro, 1973). Use of a scheme such as that provided in 
Table 5 may contribute to explicit predictions of such outcomes on the 
basis of evaluators' predispositions and preferences; e.g., an evaluator 
most concerned with process and an evaluator with a product orientation 
might give very different reports on a program. In any case, it should be 
a salutary experience for evaluators to attempt to analyze their own 
professional values and disentangle them from their conclusions. 

We hope to develop the "scales" in Table 5 further and investigate 
their psychometric properties. Comments or reports by those who try to use 
Table 5 (or some variant) would be very useful. 

6. Professional Competencies of the Evaluator 

The training of program evaluators is an educational enterprise of 
rather recent vintage. Until the past few years evaluators were drawn 
into the profession by the work to be done, or by the lack of work in related 
social science fields. Psychologists, educators, sociologists, economists, 
and anthropologists have all done a stint in the field. Some have written 
a critical note here or a how-to-do-it chapter there and then returned 
to the haven of their basic discipline. Others have stayed, some to 
try to invent new programs to train evaluators. 

The nature of these inventions varies from those designed to train 
evaluators directly to those designed to train them inductively — or by 
osmosis. Some department chairmen insist on a substantive major 
(e.g., social psychology) with a program evaluation minor. Many insist 
that future evaluators at least need a thorough grounding in "basics" 
before they get into applications. Definitions of "basics" vary, but 
frequently include such areas as experimental design, survey techniques, 
and educational philosophy. There are others who think that training 
in educational research and measurement per se qualifies a program 
evaluator. There is disagreement too about the degree to which some of 
the popular terms in the field represent "jargon" as opposed to real 
substance that future evaluators need to become thoroughly acquainted 
with. Some of the "models" of evaluation are cited as examples; e.g., 
"CIPP," "Discrepancy," "Goal-free"; see Stufflebeam et al. (1971), 
Provus (1971), and Scriven (1972), respectively. Part of the confusion 
centers around people's perception of evaluation as a discipline or a 
profession, as opposed to a job. The latter perception is associated 
with an anti-formal-training bias and advocacy of "internship" or 
"in-service" experiences. 

As this article suggests, we are inclined toward the discipline or 
profession point of view. However, we do not believe that the etiology 
of the evaluator's skills is of paramount importance. What is important 
is that those skills exist. An evaluator may have the necessary skills 
and knowledge personally, or he may have sufficient sense to obtain 
technical consultation in areas where he is deficient. Either way is 
acceptable, although we would feel more comfortable if a person with 
major evaluation responsibilities had to obtain technical consultations 
only occasionally. (Consider, as an analogy, the level of skill you 
would prefer in your medical doctor.) 

Of course, it is possible to have necessary skills for evaluation 
without much practical experience. Again, however, we would feel more 
comfortable entrusting major responsibility for an evaluation to someone 
who has had some practice. (Return again to the medical analogy and 
consider your selection of the surgeon who is to operate on you.) 

At the top of Table 6 is a fourfold scheme to aid in assessing an 
evaluator's competencies. Clearly the highest "score" (see the cell 
marked 4) would be earned by an experienced evaluator with need for only 
minimal technical consultation. The least competent level is represented 
by the cell marked 1 and defines an inexperienced evaluator with 
considerable need for technical consultation. Somewhere between these 
extremes are the other two cells. Their ordering would probably depend 
on situational factors. 
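
The fourfold scheme amounts to a simple two-way classification. As a minimal sketch, assuming hypothetical names (competence_cell and its two yes/no arguments are introduced here for illustration and do not appear in Table 6), it might be expressed as follows. Because the ordering of the two middle cells is said to depend on situational factors, the sketch deliberately leaves them unranked.

```python
# Illustrative sketch only. The function name and argument names are
# assumptions; Table 6 defines only the two extreme cells (1 and 4).

def competence_cell(has_field_experience: bool,
                    needs_only_minimal_consultation: bool) -> str:
    """Classify an evaluator, for one content or skill area, into a cell
    of the fourfold scheme at the top of Table 6."""
    if has_field_experience and needs_only_minimal_consultation:
        return "4 - highest competence"
    if not has_field_experience and not needs_only_minimal_consultation:
        return "1 - lowest competence"
    # The two remaining cells fall between the extremes; their relative
    # ordering is left open because it depends on situational factors.
    return "intermediate (ordering depends on situational factors)"


# Example: an experienced evaluator who rarely needs technical help.
print(competence_cell(True, True))
```

The same classification would be repeated for each content or skill area of interest, since an evaluator may fall in different cells for different areas.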

The cells are placeholders for the content and skill areas listed 
below them in Table 6. The listing is an eclectic one, derived from 
the panel meeting mentioned in the first footnote of this article and 
a variety of other sources, and covers considerable ground. It could 
serve as the basis for a full program of graduate studies. In practice, 
we would not expect any single evaluator to obtain a "score" of 4 as 
each area is substituted in the matrix. However, we hope that evaluators, 

Table 6 

Professional Competencies of the Evaluator 

Evaluation of Competencies: 

                         Knowledge or skill sufficient to select appropriate model(s)
                         and techniques, and to design and implement evaluation

                         with                          with minimal
                         technical consultation        technical consultation

Minimum or no            1 - Lowest Competence
field experience

Relevant successful                                    4 - Highest Competence
field experience

Content Areas: 

  Experimental design 
  Quasi-experimental design 
  Survey methods 
  Sampling 
  Case-study methodology 
  Field operations 
  Legal and professional standards for empirical studies 
  Techniques of setting goals and performance standards 
  Job analysis 
  Alternative models for program evaluation 
  Major literature and reference sources useful for evaluators 
  Methods of controlling quality of data collection and analysis 
  Data preparation and reduction 
  Applications of observation techniques, unobtrusive measures 
  Applications of interviews, questionnaires, ratings 
  Applications of tests (paper-and-pencil, situational, performance, etc.) 
  Content analysis 
  Psychometrics (reliability, validity, scaling, equating, etc.) 
  Reactive concerns in measurement and evaluation 
  Descriptive statistics 
  Inferential statistics 
  Statistical analysis 
  Correlation and regression methods 
  Cost-benefits analysis 
  Contracts and proposals 
  Major constructs in education and the social sciences 

Special Skills and Sensitivities: 

  Management skills 
  Public relations skills 
  Interpersonal skills 
  Expository skills (speaking and writing) 
  Professional and ethical sensitivity 
  Sensitivity to concerns of all interested parties 



evaluators-in-training, and those who train evaluators will be able to 
use the list (or a modified version) to check their training programs 
and personal competencies. In addition, the list should offer those 
who employ evaluators and commission evaluation efforts guidance about 
some of the knowledges and skills they might look for in potential 
evaluators. 

7. Ethical Responsibilities of the Evaluator 
and Others Involved in Program Evaluation 

If there is a more neglected issue in program evaluation than this 
one, it has been so neglected as to be no longer discernible. The 
evaluator works in a value-laden, often politically volatile, pressureful 
area. His conclusions have potential power: large-scale programs can 
be terminated, program components can be given greater emphasis, 
reputations and careers can be made or broken. Yet in this highly charged 
setting there are no credos for the evaluator, no statements of 
responsibilities for the various actors in the evaluation process, and 
few agreed-upon standards of professional behavior. 

With some diffidence we present Table 7. A statement of ethical 
responsibilities should come — and eventually has to come — from one or 
more of the professional organizations to which evaluators belong. But 
professional organizations are usually conservative and their committees 
may move slowly, even when the need is great. We hope that the 
statements presented in Table 7 will serve as an interim guide to ethical 
behavior in the evaluation area and a starting point for subsequent 
standard-setting activities by appropriate societies. 

Table 7 

Ethical Responsibilities in Program Evaluation 

Definitions used in the presentation: 

Program             - institution, organization, activities, and/or materials
                      with an education/training function.
Evaluator           - person(s) or agency with major responsibility for planning,
                      carrying out, and reporting evaluation activities (see
                      Table 1). May be independent or dependent (see Table 3).
Client              - person(s) or agency with major responsibility for securing
                      the services of an evaluator.
Participants        - administrators, instructors, students, and other persons
                      with a role in the program being evaluated.
Secondary Evaluator - person(s) or agency engaging in critical review of
                      evaluation activities. May include reanalysis of
                      previously collected data.

Evaluator to Client, Participants,      Client, Participants, and
Public, and Profession                  Secondary Evaluator to Evaluator

 1. To acquaint the potential client    Client: To provide the potential
    with those values and orientations  evaluator with as full information as
    of the evaluator that may bear on   possible about the program, the
    the proposed evaluation effort.     client's expectations for the
                                        evaluation, and the proposed
                                        conditions and resources for carrying
                                        it out.

 2. To work toward a contract or        Client: To work toward a contract or
    "agreement" with the client that    "agreement" with the evaluator that
    is ethically, legally, and          is ethically, legally, and
    professionally sound.               professionally sound.

 3. To refuse to perform work until     Client: To refrain from insisting
    such a contract or "agreement" is   that work be performed before such an
    reached.                            "agreement" is reached.

 4. To fulfill the terms of the         Client: To cooperate with the
    contract or "agreement" to the      evaluator and to fulfill to the best
    best of the evaluator's ability.    of the client's ability any
                                        commitments or obligations called for
                                        in the contract or "agreement."

 5. To acquaint the client promptly     Client: To acquaint the evaluator
    with problems arising in            promptly with problems associated
    fulfilling such terms and attempt   with the program that may affect the
    to work out a solution.             evaluation effort; to work with the
                                        evaluator in attempting to solve any
                                        mutual problems that arise.

 6. To adhere to relevant               Client: To support the evaluator's
    professional/legal standards and    adherence to relevant
    ethics in the conduct of the        professional/legal standards and
    evaluation, including appropriate   ethics in the conduct of the
    provisions for privacy and          evaluation.
    informed consent of participants
    and confidentiality of data.

 7. To carry out data collection and    Client: To encourage full and honest
    other evaluation activities with    cooperation by program participants
    as little interference as           in supplying data needed for the
    practicable with the operation of   evaluation effort.
    the program.
                                        Participants: To cooperate in the
                                        data collection effort associated
                                        with the evaluation and to provide
                                        accurate information in response to
                                        legitimate requests.

 8. To acquaint the client with any     Client: To recognize the evaluator's
    aspects of program philosophy or    "amicus" role in noting ethical,
    operation that do not appear to be  legal, or professional problems
    ethically, legally, or              associated with the program; to give
    professionally sound but are        serious consideration to the
    observed by the evaluator, even if  evaluator's observations in this
    such observation is not part of     area.
    the evaluator's specific charge;
    in addition, to inform the
    appropriate authority if the
    evaluator obtains evidence of
    legal misconduct by the client.

 9. To acquaint the client, in          Client: To advise the evaluator on
    advance of any response, with       the validity of requests for
    requests received by the evaluator  information from superordinate
    from superordinate agencies for     agencies.
    information (testimony, etc.)
    about the program or evaluation;
    to ascertain with the client
    whether such requests are valid;
    if so, to acquaint the client
    fully with the nature of the
    response.

10. To present a "balanced" report of   Client: To discourage
    results to the client in timely     misinterpretation and misuse of the
    fashion and in a form usable to     evaluation results.
    the client; to spell out
    limitations of the investigation,
    along with the evaluator's values
    and orientations that may bear on
    the conclusions.

11. To reserve the right to publish
    rejoinders to any
    misinterpretation or misuse of the
    evaluation results by the client.

12. To identify other groups that       Client: To advise the evaluator about
    have a legitimate concern for the   groups that, to the client's
    results of the evaluation and to    knowledge, have a legitimate interest
    make the results available to       in the results of the evaluation; to
    them.                               encourage dissemination of results to
                                        such groups.

13. To allow interested professionals   Secondary evaluator: To specify, at
    to examine the data produced by     the time when permission is sought to
    the evaluation, within the          review the evaluation data, the
    limitations of accepted standards   purposes of the secondary evaluation
    for privacy, confidentiality, and   effort; to maintain professional and
    informed consent related to the     ethical standards in conducting the
    purposes for which the data were    secondary evaluation, including
    collected.                          honoring any relevant commitments to
                                        those who supplied the original data;
                                        to report in a professionally sound
                                        manner on the results of the
                                        secondary evaluation.

14. To publish rejoinders to any
    misinterpretation or misuse by the
    secondary evaluator of the
    original evaluation data or
    results.

15. To share with professional
    colleagues and relevant agencies
    and institutions knowledge and
    opinion about educational and
    social processes derived from
    evaluation studies.

Table 7 is divided into two columns. The first column lists some 
of the evaluator's principal responsibilities to the client, program 
participants, the public, and the profession of evaluation (if we may so 
designate it here). The second column lists some of the responsibilities 
of the client, program participants, and secondary evaluator to the 
evaluator. These terms are defined at the beginning of the table. 

Virtually every statement in the table could be discussed at great 
length. Examples of noncompliance could be presented. Analogies with 
other disciplines could be drawn. We have decided not to do any of these 
things, because the statements themselves are what we want to draw 
attention to. There are, however, some generalizations that should be 
made: 

. . .The evaluator's responsibilities go well beyond simply carrying 
     out a competent investigation. 

. . .The evaluator has the responsibility to say "No" if that is the 
     ethical stance. It is no defense to say: "They made me do it." 

. . .Evaluation processes should be as open as possible, consonant 
     with the rights of participants and the smooth working of the 
     program. 

. . .For almost every statement of responsibility of the evaluator 
     there is a complementary responsibility of some other person 
     or group. 

. . .Separate ethical standards are not suggested in all areas (e.g., 
     with respect to protection of the rights of human subjects); 
     evaluators should refer to accepted standards in related 
     professions. 

********** 



In this paper we have drawn attention to a number of neglected issues 
in the practice of program evaluation and we have suggested some schemes 
to help reduce this neglect. Failure to attend to an issue is frequently 
a matter of not being reminded forcefully enough that it exists. We 
hope that the checklists and tables in this article will serve as reminders 
to evaluators and those they serve of the diverse purposes and general 
methods of evaluation; of types and sources of evidence associated with 
general methods of evaluation; of the importance of disseminating evaluation 
results and of some useful dissemination techniques; of the complex 
fiscal-administrative relationships that may obtain among funding agencies, 
program directors, and evaluators; of the professional predispositions 
and preferences of evaluators that may influence what they look at and 
how they look at it; of some of the competencies that evaluators need 
and that can serve as a basis for assessment (including self-assessment) 
and training of evaluators; and of the ethical responsibilities bound up 
in program evaluation. In short, it is our hope that this article will 
aid in the establishment of a systematic, scientific discipline. 

********** 